Knowledge graph revision in the context of unknown knowledge

Knowledge graphs support the representation, organization, retrieval, reasoning, and application of knowledge, providing a rich and robust cognitive foundation for artificial intelligence systems and applications. A knowledge graph needs to be updated when we learn new things, discover that old information was wrong, observe change and progress, or adopt new technology standards. In some environments, however, the initial knowledge cannot be fully known. For example, we may not have access to the full source code of a piece of software even after purchasing it. In such circumstances, is there a way to update a knowledge graph without prior knowledge? In this paper, we investigate whether there is a method for this situation within the framework of Dalal revision operators. We first prove that finding the optimal solution in this environment is a strongly NP-complete problem. We then propose two algorithms, Flaccid_search and Tight_search, which apply different acceptance conditions, and we prove that both algorithms can find the desired results.


Introduction
A knowledge graph is a graphical structure used to organize and represent knowledge. As artificial intelligence (AI) advances, the importance of knowledge graphs in the field of AI is becoming increasingly prominent [1]. A knowledge graph is a network composed of entities (such as people, places, and things) and the connections between them. It is meant to help computer systems understand and interpret information better. The goal of a knowledge graph is to simulate the human knowledge system and enable computers to process information in a more intelligent way. It serves not only as a tool for storing and retrieving knowledge but also supports applications such as semantic search, automatic question answering, and recommendation systems. By integrating knowledge from different domains into a unified structure, knowledge graphs contribute to building more comprehensive and accurate data models, providing robust support for artificial intelligence and big data analytics.
As time progresses, information in the real world continually evolves and changes, so a knowledge graph may come to contain obsolete or inaccurate data. To ensure the knowledge graph reflects the most current real-world situation, regular revisions and updates are essential. Revision may involve adding new entities and relationships, deleting outdated or incorrect information, and adjusting the weights and associations of existing data. Through such revision processes, the knowledge graph can better serve user needs by providing an accurate and reliable knowledge foundation, supporting various application domains such as search, recommendation systems, and intelligent decision-making.
Propositional logic formulas can be used to represent the relationships and entities in a knowledge graph [2]. In propositional logic, proposition symbols represent different propositions or statements, and logical operators express the relationships between them. For instance, suppose we have the following knowledge graph:
• There is a relationship R between entity A and entity B, which can be represented as R(A → B).
• There is a relationship S between entity B and entity C, which can be represented as S(B → C).
In this paper, we are concerned with revising a knowledge graph by introducing new entities and relationships in an environment where part of the knowledge is unknown. Suppose that we have n knowledge graphs, each with a current knowledge set $K_i$ and an epistemic goal $\psi_i$, where $K_i = K'_i \cup K''_i$, $K'_i$ is the part of knowledge graph i that is known, and $K''_i$ is the part of knowledge graph i that is not known. We are interested in finding a single formula $\phi$ such that each knowledge graph $K_i$ achieves the corresponding goal $\psi_i$ after receiving the formula $\phi$. In this paper, we study the following problems:
1. We study the Knowledge Graph Revision Problem with Limited Knowledge (LKG) and establish a model of the LKG problem.
2. We study the complexity of the LKG problem.
3. To address this problem, we propose two approximate search algorithms, Flaccid_search and Tight_search, and analyze the performance and characteristics of these two algorithms.

Dynamic epistemic logic
Dynamic Epistemic Logic (DEL) refers to a class of modal logics for reasoning about actions and belief [3]. The announcement problem is addressed in arbitrary public announcement logic (APAL) [4]. The model checking problem of APAL is in PSPACE [5], and the model checking of APAL in its succinct form is NEXPTIME-complete [6]. The researchers in [7] argue for the relevance of such symbolic models and show that the symbolic model checking problem against PAPL is a EXPTIME-complete as soon as announcement protocols allow for either arbitrary announcements or iteration of public announcements. The satisfiability problem of APAL is undecidable [8], but becomes decidable when announcements are propositional [9]. One highly influential approach to belief revision is the AGM approach, in which revision is captured by an operator that satisfies a particular set of rationality postulates [10]. Every AGM revision operator $*$ has the property that, for any belief set $K$, there is an underlying total pre-order $\preceq_K$ over interpretations of P such that $|K * \phi| = \min_{\preceq_K}(\phi)$ [11]. The researchers in [12] proposed the Dalal operator, denoted by $*_d$, whose underlying ordering is defined by the Hamming distance between interpretations. The researchers in [13] showed that, with some revision operators, agents can be manipulated to believe a target formula $\phi$ exclusively through indirect statements or evidence. The researchers in [14] considered a related problem called inverse revision (IR): when is $K_i *_i \phi = \psi_i$ possible for all agents i, where $\phi$ and any AGM-style revision operators $*_i$ can be chosen? The set of all such $\phi$ is called the frame, which is complete if any $\phi$ and $\phi'$ in it are logically equivalent. Checking whether a formula $\phi$ is in the frame is co-NP-complete, and deciding frame-completeness is P n 2 -complete. The researchers in [15] studied not necessarily truthful public announcements in the setting of AGM belief revision and proved that announcement finding in this setting is not only decidable, but also simpler than the corresponding problem in the most simplified modal logics. The researchers in [16] studied the problem of the existence of such an announcement in the context of model-preference definable revision operators.

Knowledge Graph (KG)
Knowledge representation has a long history of development in the fields of logic and AI. As artificial intelligence advances, the importance of knowledge graphs in the field of AI is becoming increasingly prominent [1]. Firstly, integrating deep learning with knowledge representation and reasoning can enable systems to better understand and utilize vast amounts of knowledge [17]. For example, deep learning can be used to extract knowledge from large text data and represent it in a computationally processable form to support the construction of static KGs and decision-making [18]. Many researchers utilize neural network architectures such as CNNs. The researchers in [19] employ convolutional layers and fully-connected layers to train knowledge graph embedding vectors. The researchers in [20] utilize standard convolutions, atrous convolutions, and residual networks in their neural network architecture. The researchers in [2] have proposed three types of negatives: in-batch negatives, pre-batch negatives, and self-negatives, which act as a simple form of hard negatives. Combined with the InfoNCE loss, the proposed model SimKGC can substantially outperform embedding-based methods on several benchmark datasets. The researchers in [21] have added new capabilities to the model, enabling it to process texts in various domains such as geographical areas, transportation, organizations, literary works, biology, natural sciences, astronomical objects, and architecture. Secondly, the fusion of different forms of representation is key to enhancing the capability of knowledge representation and reasoning. Logical representation is suitable for reasoning and decision-making, probabilistic graphical models are suitable for uncertainty reasoning, and graph structures are suitable for representing complex relationships, among others. Integrating different forms of representation can provide a more comprehensive and flexible capability for knowledge representation and reasoning [22]. For example, temporal KGs include time information in their data and can constantly gather and update information to keep knowledge current [23]. Thirdly, incorporating background knowledge can enhance the accuracy and robustness of systems [24]. For example, ontology technology can be used to construct domain ontologies, which are then combined with inference mechanisms to support deeper and more precise reasoning [25]. Fourthly, as reasoning is scaled up, the computational complexity of reasoning also increases [15]. Therefore, researching more efficient reasoning algorithms and technologies to improve the speed and scalability of reasoning is crucial [26]. For example, parallel and distributed computing technologies can accelerate the reasoning process [27]. Finally, incorporating human knowledge and judgment can help construct AI systems with more human-like intelligence [28]. By collaborating with domain experts and integrating their experience and intuition into knowledge representation and reasoning, more accurate and trustworthy reasoning results can be provided [29]. Therefore, combining deep learning, integrating different forms of representation, incorporating background knowledge, improving reasoning efficiency and scalability, and incorporating human knowledge and judgment can drive greater breakthroughs and progress in knowledge representation and reasoning in AI systems [30]. However, in many fields, acquiring training data for deep learning is challenging. Particularly in cases involving personal privacy, trade secrets, or extensive annotation, data acquisition is subject to strict legal and ethical requirements. Fields such as healthcare, military and intelligence, finance, and natural disaster prediction fall into this category of hard-to-obtain data. In these fields, protecting the privacy and security of data is crucial, requiring additional measures to ensure the lawful use of data [31].

Preliminaries
A knowledge graph is a formalized framework for representing knowledge; its core idea is to model real-world information as a graph structure. Formally, a knowledge graph can be defined as a triple $G = (E, R, T)$, where:
• E represents the set of entities, denoting individuals, objects, or concepts in the real world.
• R represents the set of relations, describing the semantic associations between entities.
• T represents the set of triples, with each triple $(t, r, t')$ representing the connection between two entities through relation r.
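As a minimal sketch of this definition, the triple set can be held in plain Python containers. The names `Triple`, `build_graph`, and `to_atoms` are illustrative assumptions and do not come from the paper; the atom naming scheme is just one possible way to obtain propositional symbols from triples.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One edge of the graph: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str

def build_graph(triples):
    """Return (E, R, T): entity set, relation set, and triple set."""
    T = set(triples)
    E = {t.head for t in T} | {t.tail for t in T}
    R = {t.relation for t in T}
    return E, R, T

def to_atoms(T):
    """Hypothetical encoding: name one propositional atom per triple."""
    return {f"{t.relation}({t.head},{t.tail})" for t in T}

# The two-edge graph used in the introduction: R between A and B, S between B and C.
E, R, T = build_graph([Triple("A", "R", "B"), Triple("B", "S", "C")])
atoms = to_atoms(T)
print(sorted(E), sorted(R), sorted(atoms))
```

Treating each triple as one atom keeps the later revision machinery purely propositional, at the cost of a vocabulary that grows with the number of edges.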
The Knowledge Revision Problem takes as input n agents and n AGM knowledge revision operators $*_i$; we ask whether there exists a consistent formula $\phi$ such that [15]

$$K_i *_i \phi \models \psi_i \quad \text{for all } i \in \{1, \dots, n\},$$

where $K_i$ and $\psi_i$ represent the knowledge graph set and the goal of agent i, respectively. Obviously, if $\bigwedge_i \psi_i$ is consistent, then we can just revise by this formula [15]. In some cases, however, not all the knowledge graph sets of the agents are completely known; then

$$K_i = K'_i \cup K''_i,$$

where $K'_i$ is the part of the knowledge graph set of agent i that is known, and $K''_i$ is the part of the knowledge graph set of agent i that is not known.
Example 1 Consider the UAV controller example over the vocabulary {patrol, extinguishing}. We think of this vocabulary as defining a state machine, where each interpretation represents a state; the UAV can have actions that are triggered by transitions to given states. This is a standard control mechanism for simple agents in a video game setting, and it can function as a control mechanism for our simple UAVs as well. In our example, the patrol variable is true when the UAV should be patrolling its area, and the extinguishing variable is true when the UAV receives an alarm due to fire.
Suppose we have two UAVs $U_1$ and $U_2$ with initial knowledge states defined as follows: The controller receives an alarm due to fire. Is there a formula that can be broadcast to immediately get $U_1$ to extinguish the fire while $U_2$ keeps patrolling? In other words, is there a formula $\phi$ such that: The answer is yes; we can set $\phi$ = patrol.
The preceding example is framed in the context of the UAV controller, but it also demonstrates an important case for propositional announcement. In particular, it shows that there are cases where the goals are inconsistent, yet a solution is possible. The researchers in [15] proposed the EXIST_ANN Algorithm to solve this problem. The process of EXIST_ANN is described in Algorithm 1.
The EXIST_ANN Algorithm is effective for solving the Propositional Announcement Problem. In some environments, however, the initial knowledge sets of the agents cannot be known, and in such an environment the EXIST_ANN Algorithm cannot obtain an effective solution. We explain this in detail in Example 2.
Example 2 Assume the vocabulary {a, b, c}. The knowledge set of agent $A_1$ is $K_1$ and the knowledge set of agent $A_2$ is $K_2$, where for $i \in \{1, 2\}$, $K'_i$ is the part of the knowledge set of agent $A_i$ that is known, and $K''_i$ is the part of the knowledge set of agent $A_i$ that is not known. Let $K'$ … Through the EXIST_ANN Algorithm, we can conclude that $d_1 = 1$, $d_2 = 1$, and that there exists a solution $\phi = \neg a \wedge \neg b \wedge c$. This is a wrong solution because the minimum Hamming distance is $d_{\min}(K_1, \phi) = 2$. Therefore, the EXIST_ANN Algorithm cannot deal with the problem in an environment where the initial knowledge sets of the agents are unknown.

Problem definition
In this section, we restrict the problem slightly by requiring that all agents have the same revision operator. That is, given n agents and an AGM revision operator $*$, we ask whether there exists a consistent formula $\phi$ such that

$$K_i * \phi \models \psi_i \quad \text{for all } i \in \{1, \dots, n\},$$

where $K_i$ and $\psi_i$ represent the knowledge set and the goal of agent i, respectively. In principle, $*$ could be any shared revision operator. In some cases, however, not all the knowledge sets of the agents are completely known; thus

$$K_i = K'_i \cup K''_i,$$

where $K'_i$ is the part of the knowledge set of agent i that is known, and $K''_i$ is the part of the knowledge set of agent i that is not known. In practice, we will often be interested in the complexity of finding announcements, but we must first consider the corresponding decision problem. The formal definition is as follows.
Definition 4.1 The Knowledge Graph Revision Problem with Limited Knowledge
Input: An integer n; a list of the known initial knowledge $K'_1, \dots, K'_n$; a list of n formulas (goals) $\psi_1, \dots, \psi_n$.
Output: Yes, if there exists $\phi$ satisfying Eq 2; No, otherwise.
We refer to this problem as LKG($*$), which emphasizes that it depends on some given operator $*$ on knowledge sets. We normally assume that $*$ is an AGM revision operator, but this need not be the case in general.
Next, we still need to further explore the complexity of the LKG($*_d$) problem. This problem can be seen as two stages: stage one is the estimation of $v_i$; stage two is to check the minimal Hamming distance between $v_i$ and $K'_i$.

Problem solution
For the moment, we are interested in analyzing a simple case to obtain the most efficient algorithm possible. Hence, we consider Dalal's well-known revision operator based on Hamming distance [12]. In this section, we let $*_d$ denote Dalal's revision operator. We present the Flaccid_search Algorithm and the Tight_search Algorithm to solve the LKG($*_d$) problem. In the algorithms, we use non-deterministic choice to select the minimum distance $d_i$ between $K'_i$ and $\psi_i$. Also note that $d(w, v)$ denotes the Hamming distance between interpretation w and interpretation v. The process of the Flaccid_search algorithm is shown in Algorithm 2.
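To make the Hamming distance and Dalal's operator concrete, the following is a brute-force sketch over a small vocabulary. It assumes that formulas are given as Python predicates over truth assignments; the function names are illustrative, and the handling of an inconsistent knowledge set is a convention chosen here, not something stated in the paper.

```python
from itertools import product

def interpretations(vocab):
    """Yield every truth assignment over vocab as a dict atom -> bool."""
    for values in product([False, True], repeat=len(vocab)):
        yield dict(zip(vocab, values))

def models(formula, vocab):
    """All interpretations over vocab that satisfy formula (a predicate)."""
    return [w for w in interpretations(vocab) if formula(w)]

def hamming(w, v):
    """d(w, v): number of atoms on which w and v disagree."""
    return sum(w[a] != v[a] for a in w)

def dalal_revise(K, phi, vocab):
    """Models of K *_d phi: the models of phi closest to the models of K."""
    k_models, phi_models = models(K, vocab), models(phi, vocab)
    if not phi_models:
        return []
    if not k_models:
        # Convention assumed here: an inconsistent K is replaced by phi.
        return phi_models
    def dist(v):
        return min(hamming(w, v) for w in k_models)
    best = min(dist(v) for v in phi_models)
    return [v for v in phi_models if dist(v) == best]

vocab = ["a", "b"]
K = lambda w: w["a"] and w["b"]      # knowledge: a and b
phi = lambda w: not w["a"]           # announcement: not a
result = dalal_revise(K, phi, vocab)
print(result)
```

The enumeration is exponential in the vocabulary size, so this is only suitable for checking small examples by hand.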

Algorithm 2: Flaccid_search
Line 1 of Algorithm 2 estimates the value of m from $K'_1, K'_2, \dots, K'_n$ and $\psi_1, \psi_2, \dots, \psi_n$ (i.e., m is the size of the vocabulary of $K'_1, K'_2, \dots, K'_n$ and $\psi_1, \psi_2, \dots, \psi_n$). Line 2 estimates the value of $d_i$, which ranges over $\{0, 1, \dots, m\}$. Line 3 estimates the value of $v_i$. Lines 4 to 12 judge whether $v_i$ is acceptable; if it is, $v_i$ is added to $\phi$, which is empty in the initial state. Although the acceptance condition of Algorithm 2 is relatively loose, there are still situations in which Algorithm 2 has no solution, as stated in Proposition 4.1.
Proposition 4.1 There exist instances for which Algorithm 2 has no solution.
Next, we need to prove that if there exists $\phi$ satisfying LKG($*_d$), then Flaccid_search produces an accepting result. In the proof, we write $d(x_i, v_i)$ for the minimal Hamming distance from a model of $v_i$ to a model of $x_i$.
Theorem 4.1 Flaccid_search($K_1, K_2, \dots, K_n$; $\psi_1, \psi_2, \dots, \psi_n$) accepts if there exists $\phi$ such that $K_i *_d \phi \models \psi_i$ for each i.
Proof. Because there exists $\phi$ such that $K_i *_d \phi \models \psi_i$ for each i, according to Eq 2, for all $i \in \{1, 2, \dots, n\}$ there exists $v_i \models K_i$ with $d(x_i, v_i) = d_i$. Thus Algorithm 2 accepts.
The process of the Tight_search Algorithm is similar to that of the Flaccid_search Algorithm, except that its acceptance conditions are more stringent. The process of the Tight_search algorithm is shown in Algorithm 3. Although the acceptance condition of Algorithm 3 is strict, there are still situations in which Algorithm 3 has a solution, as stated in Proposition 4.2.
Proposition 4.2 There exist instances for which Algorithm 3 has a solution.
Next, we need to prove that if Tight_search($K_1, K_2, \dots, K_n$; $\psi_1, \psi_2, \dots, \psi_n$) produces an accepting result $\phi$, then $\phi$ satisfies LKG($*_d$). In the proof, we write $d(x_i, v_i)$ for the minimal Hamming distance from a model of $v_i$ to a model of $x_i$.
Theorem 4.2 If Tight_search($K_1, K_2, \dots, K_n$; $\psi_1, \psi_2, \dots, \psi_n$) accepts, then there exists $\phi$ such that $K_i *_d \phi \models \psi_i$ for each i.
Proof. For all $i \in \{1, 2, \dots, n\}$, $K_i$ is consistent, so there exists $x_i \models K_i$; since $K_i = K'_i \cup K''_i$, we also have $x_i \models K'_i$. Since Algorithm 3 accepts, $\forall w \models K'_i,\ d(w, v_i) = d_i$, so we can conclude that $d(x_i, v_i) = d_i$. Thus there exists $\phi$ such that $K_i *_d \phi \models \psi_i$ for each i.
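The acceptance test used in this proof can be sketched as follows. This is only an illustration of the two conditions mentioned in the text (every model w of the known part $K'_i$ is at distance exactly $d_i$ from $v_i$, and $v_i$ satisfies $\psi_i$); it is not the full Algorithm 3, and `tight_accepts` together with its argument layout is an assumption made here.

```python
def hamming(w, v):
    """d(w, v): number of atoms on which w and v disagree."""
    return sum(w[a] != v[a] for a in w)

def tight_accepts(known_models, v, d, goal):
    """Strict check for one agent.

    known_models: the models of the known part K'_i (list of dicts)
    v:            the guessed interpretation v_i
    d:            the guessed distance d_i
    goal:         psi_i as a predicate over interpretations
    """
    same_distance = all(hamming(w, v) == d for w in known_models)
    return same_distance and goal(v)

# Hypothetical single-agent instance over the vocabulary {a, b}.
known = [{"a": True, "b": False}]
v = {"a": True, "b": True}
goal = lambda w: w["b"]
print(tight_accepts(known, v, 1, goal))   # distance guess matches
print(tight_accepts(known, v, 0, goal))   # distance guess does not match
```

Requiring the same distance for every known model is what makes the accepted $v_i$ robust to whichever model of $K_i$ turns out to be the actual one, which is the step the proof relies on.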
Although the Flaccid_search Algorithm and the Tight_search Algorithm can obtain the desired solution, we still need to further explore the complexity of the LKG($*_d$) problem. This problem can be seen as two stages: stage one is the estimation of $v_i$; stage two is to check the minimal Hamming distance between $v_i$ and $K'_i$. In the initial estimation of $v_i$, the number of possible values of $v_i$ is exponential, but a guessed value can be verified by a non-deterministic Turing machine, so this stage is an NP problem.
After the initial estimation, we need to judge whether $d(K'_i, v_i) < d_i$. Recall that $d(K'_i, v_i)$ represents the minimum Hamming distance here, so this check can be performed as follows. Guess a set of atomic propositional variables of size less than $d_i$, and let $v'_i$ be the interpretation obtained from $v_i$ by switching the truth values of these variables. If $v'_i \models K'_i$, then the minimum distance between $K'_i$ and $v_i$ is less than $d_i$. Since we need the minimal Hamming distance, we must consider all possible values of $v'_i$, so this process is a co-NP problem. Hence the entire LKG($*_d$) problem is the combination of an NP problem and a co-NP problem. Recall that the complexity of the PAP($*_d$) problem is $\Sigma^P_2$ [15]. Compared with the PAP($*_d$) problem, the LKG($*_d$) problem is more complex (i.e., the initial knowledge sets of the agents are unknown) and has more application value, but through our algorithms, the LKG($*_d$) problem can be efficiently solved, which is the desired result.
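The distance check described above can be written deterministically by enumerating every set of fewer than $d_i$ variables instead of guessing one. The sketch below assumes the known part $K'_i$ is given as a Python predicate; `closer_model_exists` is an illustrative name, not one used in the paper.

```python
from itertools import combinations

def closer_model_exists(v, d, K_known, vocab):
    """True if flipping fewer than d atoms of v yields a model of K_known.

    A True result means the minimal Hamming distance between K_known and v
    is strictly less than d, so the guessed distance d is too large.
    """
    for size in range(d):                       # sizes 0, 1, ..., d - 1
        for flipped in combinations(vocab, size):
            candidate = {a: (not v[a]) if a in flipped else v[a] for a in vocab}
            if K_known(candidate):
                return True
    return False

vocab = ["a", "b", "c"]
K_known = lambda w: w["a"] and w["b"]           # every model has a and b true
v = {"a": False, "b": False, "c": False}        # two flips away from K_known

print(closer_model_exists(v, 2, K_known, vocab))
print(closer_model_exists(v, 3, K_known, vocab))
```

The enumeration visits all subsets of size below $d_i$, which is exactly the universal quantification that places the check in co-NP.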

Example
In this section, we are interested in a different question: can we use announcement finding to implement a feasible solution to a problem of practical interest? To answer this question, we describe a practical tool that uses announcement finding as the basis for a simulated robot controller.
Alice believes that eggs should be eaten every day and that milk cannot be drunk with yogurt; what other people do not know is that Alice also thinks that milk should be drunk whenever eggs are eaten. Bob believes that yogurt should be drunk and that eggs should not be eaten. We want to find an announcement that makes Alice believe that milk can be drunk with yogurt and makes Bob believe that eggs can be eaten with yogurt, while changing their original knowledge as little as possible. For this environment, we build a model in which the knowledge set of Alice is $K_{Alice}$ and the knowledge set of Bob is $K_{Bob}$, with the initial knowledge states defined as follows: After receiving the announcement, the knowledge of Alice becomes that milk can be drunk with yogurt, and the knowledge of Bob becomes that eggs can be eaten with yogurt. Therefore $\psi_{Alice} = \{\mathit{milk} \wedge \mathit{yogurt}\}$ and $\psi_{Bob} = \{\mathit{yogurt} \wedge \mathit{egg}\}$. We want to find a formula $\phi$ such that $K_{Alice} *_d \phi \models \psi_{Alice}$ and $K_{Bob} *_d \phi \models \psi_{Bob}$.
The input of the algorithm is $K'_{Alice}$ and $K'_{Bob}$. Executing Algorithm 3, suppose we get $v_1 = \neg\mathit{egg} \wedge \neg\mathit{milk} \wedge \mathit{yogurt}$, $v_2 = \neg\mathit{egg} \wedge \mathit{yogurt}$ and $d_1 = 1$, $d_2 = 0$. Then, according to line 5 of Algorithm 3, $\exists w \models K'_{Alice}$ with $d(w, v_1) \neq d_1$, so this solution is rejected. If we get $v_1 = \mathit{egg} \wedge \neg\mathit{milk} \wedge \neg\mathit{yogurt}$, $v_2 = \mathit{egg} \wedge \neg\mathit{yogurt}$ and $d_1 = 1$, $d_2 = 2$, then according to line 6 of Algorithm 3, $v_1 \not\models \psi_1$ and $v_2 \not\models \psi_2$, so this solution is rejected. If we get $v_1 = \mathit{egg} \wedge \mathit{milk} \wedge \mathit{yogurt}$, $v_2 = \mathit{egg} \wedge \mathit{yogurt}$ and $d_1 = 1$, $d_2 = 1$, then according to line 5 of Algorithm 3, $\forall w \models K'_{Alice},\ d(w, v_1) = d_1$ and $\forall w \models K'_{Bob},\ d(w, v_2) = d_2$, and according to line 6 of Algorithm 3, $v_1 \models \psi_{Alice}$ and $v_2 \models \psi_{Bob}$. Thus $\phi = v_1 \wedge v_2$, and this solution is accepted.

Experiment
In this section, we validate the previously proposed Flaccid_search algorithm and Tight_search algorithm. Firstly, we describe the source of the dataset and the evaluation metrics. Then, we design a series of experiments to verify the effectiveness and performance of the algorithms by comparing their performance on different datasets. Finally, we analyze the experimental results, discuss the advantages and disadvantages of the two algorithms, and explore possible directions for improvement. Through these validations and analyses, we can better understand the characteristics of the two algorithms and provide guidance and reference for their practical applications.

Datasets
In this experiment, we used the publicly available Bitcoin OTC trust weighted signed network dataset as the basis for our research. It contains 6000 users and more than 35000 records. This dataset holds a wealth of information about the who-trusts-whom network of people, which is pivotal to our investigation of the Flaccid_search algorithm and the Tight_search algorithm.

Implementation details
From this dataset, we extracted the trust scores and preprocessed and cleansed the data using binary conversion. Subsequently, we applied the Flaccid_search algorithm and the Tight_search algorithm for data analysis and modeling, yielding several intriguing findings. All experiments were implemented using Python 3.9 and executed on an Apple M1 processor.
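The paper does not spell out how the binary conversion is performed. One plausible reading, shown here purely as an assumption, is to map each signed trust rating to a Boolean value by thresholding, so that each (source, target) pair becomes a propositional atom that is either true or false. The sample rows, the threshold of 0, and the name `binarize` are all hypothetical.

```python
def binarize(rating, threshold=0):
    """Hypothetical rule: ratings above the threshold count as 'trusts' (1)."""
    return 1 if rating > threshold else 0

# Made-up (source, target, rating) rows for illustration; not taken from the dataset.
rows = [(6, 2, 4), (1, 15, -10), (4, 3, 7)]

# One Boolean value per directed edge, usable as a truth assignment.
edges = {(src, dst): binarize(rating) for src, dst, rating in rows}
print(edges)
```

Other conversions (for example, a different threshold or a multi-bit encoding of the score) would be equally consistent with the description in the text.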

Comparison and evaluation
In the context of knowledge graph revision, our algorithm evaluation focuses on the existence rate (ER) and applicability rate (AR) of solutions. Specifically, the Flaccid_search algorithm primarily targets ER, and the Tight_search algorithm primarily targets AR. This explicit classification aids in a deeper understanding of the performance characteristics of different algorithms in knowledge graph revision.
Based on Table 1, as the parameter d varies, the proposed Flaccid_search algorithm consistently outperforms the EXIST_ANN algorithm in terms of the ER metric. This indicates that the Flaccid_search algorithm has a higher probability of finding solutions, demonstrating its superior practicality. A more detailed examination reveals that the Flaccid_search algorithm exhibits a basically stable trend in ER values as d increases, whereas the EXIST_ANN algorithm shows relatively lower and more fluctuating ER values. This trend suggests that the Flaccid_search algorithm is more reliable in finding solutions under different parameter settings, whereas the solution capability of the EXIST_ANN algorithm is relatively unstable. Furthermore, the Flaccid_search algorithm maintains high ER values even at higher values of d, indicating its robustness and stability in finding solutions even under complex conditions. In contrast, the EXIST_ANN algorithm performs poorly at higher values of d, with significantly lower ER values, suggesting potential difficulties in finding solutions under complex conditions. In summary, the proposed Flaccid_search algorithm is likely to have a higher probability of finding solutions in practical applications, demonstrating greater practicality and reliability, particularly when faced with complex conditions and parameter settings.
Table 2 presents the average value of AR as the variable d changes, while Figs 1 to 3 illustrate the distribution of AR across different users as d changes. From Table 2 and Figs 1 to 3, we can see that the proposed Tight_search algorithm is clearly more widely distributed than the EXIST_ANN algorithm in the high-accuracy interval. Moreover, it remains stable as d changes. This indicates that the Tight_search algorithm exhibits better stability and consistency under these conditions and is more robust to variations in the parameter d. Therefore, based on these observations, we can tentatively infer that the Tight_search algorithm may perform better on this problem.

Conclusion and future work
We have considered the knowledge graph revision problem in the setting where the initial knowledge is unknown. In this setting, past work cannot come up with effective solutions, so we studied the Knowledge Graph Revision Problem with Limited Knowledge (LKG) and established a model of the LKG problem. To address this problem, we proposed two approximate search algorithms, Flaccid_search and Tight_search, and analyzed their performance and characteristics. We proved that both the Flaccid_search Algorithm and the Tight_search Algorithm can obtain the desired result. Finally, we explained the algorithms through an example and demonstrated their effectiveness in solving the knowledge graph revision problem.
In the future, we will look into more complicated situations, such as modal logical settings or dynamic cognitive logical settings. We will work on making the entities and connections in knowledge graphs more accurate and consistent, which means improving methods such as entity linking and relationship extraction. We will also concentrate on upgrading applications, especially in fields such as healthcare and finance.